Query processing in distributed, taxonomy-based information sources
Abstract
We address the problem of answering queries over a distributed information system that stores objects indexed by terms organized in a taxonomy. The taxonomy consists of subsumption relationships between negation-free DNF formulas on terms and negation-free conjunctions of terms. In the first part of the paper, we consider the centralized case, deriving a hypergraph-based algorithm that is efficient in data complexity. In the second part, we consider the distributed case, presenting alternative ways of implementing the centralized algorithm. These alternatives follow from two basic criteria: direct vs. query re-writing evaluation, and centralized vs. distributed data or taxonomy allocation. Combining these criteria makes it possible to cover a wide spectrum of architectures, ranging from client-server to peer-to-peer. We evaluate the performance of the various architectures by simulation on a network with O(10) nodes, and derive final results. Finally, we include an extensive review of the relevant literature.
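To make the centralized setting concrete, the following is a minimal sketch of query evaluation for single-term queries. It is not the paper's hypergraph-based algorithm: it is a naive fixpoint iteration, written under the assumption that every subsumption relationship has first been normalized into the form "conjunction of terms ⊑ term" (a relationship D1 ∨ D2 ⊑ t1 ∧ t2 is logically equivalent to the four relationships Di ⊑ tj). The names `evaluate_terms`, `index`, and `rules`, as well as the example terms, are hypothetical.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# A normalized subsumption: (conjunction of terms, subsuming term).
Rule = Tuple[FrozenSet[str], str]


def evaluate_terms(index: Dict[str, Set[int]],
                   rules: List[Rule]) -> Dict[str, Set[int]]:
    """Return, for every term, the objects that answer a query on that term.

    index maps each term to the objects explicitly indexed under it.
    The result is the least fixpoint of
        ans(t) = index(t) ∪ ⋃ { ⋂ ans(u) for u in body : (body ⊑ t) in rules }.
    """
    answers = {term: set(objs) for term, objs in index.items()}
    # Make sure every term mentioned in a rule has an entry.
    for body, head in rules:
        answers.setdefault(head, set())
        for term in body:
            answers.setdefault(term, set())

    changed = True
    while changed:
        changed = False
        for body, head in rules:
            if not body:
                continue  # assumption: conjunctions are non-empty
            # Objects satisfying every conjunct also satisfy the head term.
            derived = set.intersection(*(answers[t] for t in body))
            if not derived <= answers[head]:
                answers[head] |= derived
                changed = True
    return answers


# Hypothetical taxonomy: DigitalCamera ⊑ Camera, Reflex ⊑ Camera,
# and Camera ∧ Digital ⊑ DigitalCamera.
index = {"DigitalCamera": {1}, "Camera": {2}, "Digital": {2, 3}, "Reflex": {4}}
rules = [
    (frozenset({"DigitalCamera"}), "Camera"),
    (frozenset({"Camera", "Digital"}), "DigitalCamera"),
    (frozenset({"Reflex"}), "Camera"),
]
answers = evaluate_terms(index, rules)
print(sorted(answers["Camera"]))         # [1, 2, 4]
print(sorted(answers["DigitalCamera"]))  # [1, 2]
```

Each rule with more than one conjunct behaves like a hyperedge from a set of terms to a single term, which is why a hypergraph is a natural representation of such a taxonomy. The loop above rescans every rule on each pass, so it only illustrates the semantics and makes no claim about the efficiency the paper reports.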
Similar resources
Query Processing in a P2P Network of Taxonomy-based Information Sources
In this study we address the problem of answering queries over a peer-to-peer system of taxonomy-based sources. A taxonomy states subsumption relationships between negation-free DNF formulas on terms and negation-free conjunctions of terms. To the end of laying the foundations of our study, we first consider the centralized case, deriving the complexity of the decision problem and of query eval...
A Unifying Framework for Flexible Information Access in Taxonomy-Based Sources
A taxonomy-based source consists of a taxonomy and a database storing objects that are indexed in terms of the taxonomy. For this kind of sources, we describe a flexible interaction scheme that allows users to retrieve the objects of interest without having to be familiar with the terms of the taxonomy or with the supported query language. Specifically we describe an interaction manager whose f...
Quality-adaptive Query Processing
For non-collaborative data sources, both cost estimate-based optimization and quality-driven query processing are difficult to achieve because the sources do not export cost information nor data quality indicators. In this paper, we first propose an expressive query language extension using QML syntax for defining in a flexible way dimensions, metrics of data quality and data source quality. We...
Keyword search across distributed heterogeneous structured data sources
Many applications and users require integrated data from multiple, distributed, heterogeneous (semi-) structured sources. Sources are relational databases, XML databases, or even structured Web resources. Mediator systems represent one class of solutions for data integration. They provide a uniform view and uniform way to query the virtually integrated data. As data resides in the local sources...
Abduction for Accessing Information Sources
We consider a general form of information sources, consisting of a set of objects classified by terms arranged in a taxonomy. The query-based access to the information stored in sources of this kind, is plagued with uncertainty, due, among other things, to the possible linguistic mismatch between the user and the object classification. To overcome this uncertainty in all situations in which the...
Journal title:
- CoRR
Volume abs/1109.2425
Publication date: 2011